Abstract: Information-centric networks make storage one of
the network primitives, and propose to cache data within the 
network in order to improve latency to access content and reduce 
bandwidth consumption. We study the throughput capacity of an 
information-centric network when the data cached in each node 
has a limited lifetime. The results show that with some fixed 
request and cache expiration rates the network can have the 
maximum throughput order of $1/\sqrt{n}$ and $1/\log n$ in case of grid
and random networks, respectively. Comparing these values with
the corresponding throughput with no cache capability ($1/n$ and
$1/\sqrt{n \log n}$, respectively), we can actually quantify the asymptotic
advantage of caching. Moreover, since the request rates will 
decrease as a result of increasing download delays, increasing the 
content lifetimes according to the network growth may result in 
higher throughput capacities. 

I. Introduction 

In today's networking situations, users are mostly interested 
in accessing content regardless of which host is providing this 
content. They are looking for a fast and secure access to data 
in a whole range of situations: wired or wireless; heterogeneous
technologies; in a fixed location or when moving. The
dynamic characteristics of the network users make the host-centric
networking paradigm inefficient. Information-centric
networking (ICN) is a new networking architecture where
content is accessed based upon its name, and independently
of the location of the hosts [1]-[4]. In most ICN architectures,
data is allowed to be stored in the nodes and routers within 
the network in addition to the content publisher's servers. This 
reduces the burden on the servers and on the network operator, 
and shortens the access time to the desired content. 

Combining content routing with in-network storage for the
information is intuitively attractive, but there have been few
works considering the impact of such an architecture on the
capacity of the network in a formal or analytical manner.
In this work we study an information-centric network where 
nodes can both route and cache content. We also assume that 
a node will keep a copy of the content only for a finite period 
of time, that is until it runs out of memory space in its cache 
and has to rotate content, or until it ceases to serve a specific 
content. 

The nodes issue some queries for content that is not 
locally available. We suppose that there exists a server which 
permanently keeps all the contents. This means that the content
is always provided at least by its publisher, in addition to the
potential copies distributed throughout the network. Therefore, 
at least one replica of each content always exists in the network 
and if a node requests a piece of information, this data will be 
furnished either by its original server or by a cache containing 
the desired data. When the customer receives the content, it 
will store the content and share it with the other nodes if 
needed. 

The present paper thus investigates the throughput capacity 
in such content-centric networks and addresses the following 
questions: 

1) Looking at the throughput capacity, can we quantify the 
performance improvement brought about by a content- 
centric network architecture over networks with no con- 
tent sharing capability? 

2) How does the caching policy, and in particular, the 
length of time each piece of content spends in the 
cache's memory, affect the capacity? 

We state two theorems below. Theorem 1 answers the
first question by studying two different network models (grid
and random network) and two content discovery scenarios
(shortest path to the server and flooding). Theorem 2 derives
conditions on the request rate (namely, the
popularity of the content) and the time spent in the cache
under which these throughputs can be supported by all the nodes,
so that no node becomes a bottleneck. These theorems
demonstrate that adding the content sharing capability to the
nodes can significantly increase the capacity.

Theorem 1. Consider a wireless network consisting of n 
nodes, with each node containing the information in its local 
cache with probability p. Each node can transmit at W bits 
per second over a common wireless channel shared by all 
nodes. 

• Scenario i: If the nodes are located on a grid and
search for the contents just on the shortest path toward
the server, the maximum achievable throughput capacity
order¹ is

¹ $f(n) = O(g(n))$ if $\sup_n (f(n)/g(n)) < \infty$. $f(n) = \Omega(g(n))$ if
$g(n) = O(f(n))$. $f(n) = \Theta(g(n))$ if both $f(n) = O(g(n))$ and
$f(n) = \Omega(g(n))$.



• Scenario ii: If the nodes are located on a grid and use
flooding as their content search algorithm, the maximum
achievable throughput is



/max — \ , w . 



• Scenario iii: If the nodes are randomly distributed over
a unit square area and use the path-wise content discovery
algorithm, the maximum achievable capacity is
w 
Theorem 2. Assume that in the networks of Theorem 1 the content
request and dropping rates are $\lambda$ and $\mu$, respectively. The
throughput capacities of Theorem 1 are supportable if $\lambda/\mu$ =



0( »i^n) /„ , cenano iif = 0( «g ) in 

scenario iii.

The rest of the paper is organized as follows. After a brief
review of the related work, the network models, the content
availability and the content discovery algorithms used in the
current work are introduced in Section II. Theorems 1 and 2
are proved in Sections III and IV, respectively. Finally, we
discuss the results in Section V.

II. Preliminaries 

A. Related Work 

Many aspects of ICN networks have been studied in prior
works [3]. Some performance metrics like miss ratio in the
cache, or the average number of hops each request travels to
locate the content have been studied in [5], [6].

Optimal cache locations [7] and cache replacement techniques
[8] are two other aspects most commonly investigated.
An analytical framework for investigating properties of
these networks, like fairness of cache usage, is proposed in [9].
[10] considered information being cached for a limited amount
of time at each node, as we do here, but focused on the flooding
mechanism to locate the content, not on the capacity of the
network.

However, to the best of our knowledge, there are just a few
works focusing on the achievable data rates in such networks.
[11] uses a network simulation model and evaluates the performance
(file transfer delay) in a cache-and-forward system
with no request for the data. [12] proposes an analytical model
for single cache miss probability and stationary throughput in
cascade and binary tree topologies. [13] considers a general
problem of delivering content cached in a wireless network and
provides some bounds on the caching capacity region from an
information-theoretic point of view. Some scaling regimes for
the required link capacity are computed in [14] for a static cache
placement in a multihop wireless network.



B. Network Model

Two network models are studied in this work.

1) Grid Network: Assume that the network consists of n
nodes $V = \{v_1, v_2, \ldots, v_n\}$, each with a local cache of size L,
located on a grid (Figure 1). The distance between two adjacent
nodes equals the transmission range of each node, so
the packets sent from a node are only received by its four adjacent
nodes. There are m different contents, $F = \{f_1, f_2, \ldots, f_m\}$,
with sizes $B_i$, $i = 1, \ldots, m$, for which each node $v_j$ may
issue a query. Based on the content discovery algorithms
which will be explained later in this section, the query will
be transmitted in the network to discover a node containing
the desired content locally. $v_j$ then downloads b bits of data
with rate $\gamma$ in a hop-by-hop manner through the path $P_{xj}$ from
either a node ($v_i$, $x = i$) containing it locally ($f \in v_i$) or the
server ($x = s$). When the download is completed, the end user
stores the data in its local cache and shares it with other nodes.
$P_{js \to i}$ denotes the nodes on the path from $v_j$ to the server before
reaching node $v_i$. Without loss of generality, we assume that
the server is attached to the node located at the middle of the
network; changing the location of the server does not affect the
scaling laws. Using the protocol model and according to [15],
the transport capacity in such a network is upper bounded by
$\Theta(W\sqrt{n})$. This is the model studied in the first two scenarios
of Theorem 1.

Fig. 1. The transmission range of node v contains four surrounding nodes.
The black vertices contain the content in their local caches. The arrow lines
demonstrate a possible discovery and receive path in scenario i, where node
v downloads the required information from u. In scenario ii, v will download
the data from w instead.

2) Random Network: The last network studied in Theorem
1 is a more general network model where the nodes are
randomly distributed over a unit square area according to a
uniform distribution. We use the same model used in [15]
(Section 5) and divide the network area into square cells each
with side length proportional to the transmission range $r(n)$,
which is selected to be at least in the order of $\sqrt{\log n / n}$ to
guarantee the connectivity of the network [16]. According
to the protocol model [15], if the cells are far enough apart they
can transmit data at the same time with no interference; we
assume that there are $M^2$ non-interfering groups which take
turns to transmit at the corresponding time slot in a round
robin fashion. The server is assumed to be located at the
middle of the network. In this model the maximum number
of simultaneous feasible transmissions will be in the order of
$1/r^2(n)$, as each transmission consumes an area proportional to
$r^2(n)$.

All the other assumptions regarding the contents, requests, 
and time-out durations are similar to the grid network. 

C. Content Discovery Algorithm

1) Path-wise Discovery: To discover the location of the 
desired content, the request is sent through the shortest path 
toward the server containing the requested content. If an 
intermediate node has the data in its local cache, it does 
not forward the request toward the server anymore and the 
requester will start downloading from the discovered cache. 
Otherwise, the request will go all the way toward the server 
and the content is obtained from the main source. In the case of
the random network, when a node needs a piece of information,
it will send a request to its neighbors toward the server, i.e.,
the nodes in the same cell and in one adjacent cell on the path
toward the server. If any copy of the data is found, it will be
downloaded. If not, just one node in the adjacent cell will
forward the request to the next cell toward the server.

2) Flooding/Ring Search: In this algorithm the request for 
the information is sent to all the nodes in the transmission 
range of the requester. If a node receiving the request contains 
the required data in its local cache, it notifies the requester 
and then downloading from the discovered cache is started. 
Otherwise, all the nodes that receive the request will broadcast 
the request to their own neighbors. This process continues 
until the content is discovered in a cache and the downloading 
follows after that. 
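
The two discovery procedures can be contrasted on a small grid. The following Python sketch is illustrative only: the coordinate representation, the function names `pathwise_discovery` and `ring_search`, and the choice to walk the x direction first along the shortest path are assumptions made for this example and are not specified by the model.

```python
def pathwise_discovery(cached, requester, server):
    """Follow a shortest path from requester to server (x first, then y).

    Returns (hops, source), where source is the first cache met on the
    path, or the server if no node on the path holds the content.
    """
    x, y = requester
    sx, sy = server
    hops = 0
    while (x, y) != (sx, sy):
        if x != sx:
            x += 1 if sx > x else -1
        else:
            y += 1 if sy > y else -1
        hops += 1
        if (x, y) in cached:
            return hops, (x, y)
    return hops, server


def ring_search(cached, requester, server, side):
    """Flood ring by ring (hop distance 1, 2, ...) on a side x side grid.

    Returns (hops, source) for the first ring that contains a cache
    holding the content, or the server itself.
    """
    rx, ry = requester
    for h in range(1, 2 * side):
        ring = [(x, y) for x in range(side) for y in range(side)
                if abs(x - rx) + abs(y - ry) == h]
        hits = [v for v in ring if v in cached or v == server]
        if hits:
            return h, hits[0]
    raise ValueError("server is not on the grid")


# 5 x 5 grid, server in the middle, one cached copy next to the requester
# but off the shortest path used by the path-wise search.
cached = {(0, 1)}
path_result = pathwise_discovery(cached, (0, 0), (2, 2))
ring_result = ring_search(cached, (0, 0), (2, 2), side=5)
```

In this toy configuration the path-wise search travels all the way to the server, while the ring search finds the neighboring cache after one hop, which mirrors the difference between scenarios i and ii illustrated in Fig. 1.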

3) Content Distribution in Steady-State: The time diagram
of the data access process in the studied network is illustrated
in Figure 2. When a query for content $f_i$ is initiated, the
content is available at the requester's cache after a wait time
($T_3$) which is a function of the distance between the user and
the data source (server or an intermediate cache), the data
size, and the download speed. The expiration timer will be
reset upon receiving the data and this data will be dropped
after an exponentially distributed holding time ($T_1$) with mean
$1/\mu_i$. The user may re-issue a query for that data after another
exponentially distributed time ($T_2$) with mean $1/\lambda_i$. The solid
lines in this diagram denote the portions of time that the data
is available at the local cache.



In this work we assume identical content sizes $B_i = B$,
and assume all the contents have the same popularity, leading
to similar request rates $\lambda_i = \lambda$, and the same time-out rates $\mu_i =
\mu$. As the requests for different contents are supposed to be
independent and time-outs are set for each content independently
of the others, we can do the calculations for one single content.
If the total number of contents is not a function of the network
size, this will not change the capacity order. Suppose that B is
much larger than the request size, so we ignore the overhead
of the discovery phase in our calculations. Furthermore, if the
information sizes are the same and the download rates are also
the same, the download time will be a function of the number
of hops (h) between the source and the customer: $T_3 = Bh/\gamma$.
In the steady-state analysis, we ignore this constant time.

We assume that each node generates a request for a content
according to an exponential process with mean inter-arrival
time $1/\lambda$ if it does not have it in its local cache. On the other
hand, there is a time-out timer for each cached content which
is reset upon receiving the content and times out according to
an exponential process with average time $1/\mu$. Therefore, each
node's cache state for each piece of information changes
according to a Markov process with two states, 0 and 1, and
transition rates $\lambda$ for the change from state 0 to 1, and $\mu$ from 1
to 0.

The average portion of time that each node contains a
content in its local cache is

$$p = \frac{1/\mu}{1/\mu + 1/\lambda} = \frac{\lambda}{\lambda + \mu}, \qquad (1)$$

which is the average probability that a node contains the data
at steady state.
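
The steady-state cache probability of the two-state process can be checked numerically. The sketch below is a minimal illustration, not part of the analysis: the function names, the simulation horizon, and the seed are arbitrary choices made for the example, and the download time $T_3$ is ignored as in the steady-state analysis.

```python
import random


def cache_probability(lam, mu):
    """Closed form lambda / (lambda + mu) for the on/off cache process."""
    return lam / (lam + mu)


def simulate_cache_fraction(lam, mu, horizon, seed=0):
    """Fraction of time one node holds the content over [0, horizon].

    The node alternates between 'missing' (exponential wait, rate lam,
    until the next request) and 'cached' (exponential holding time, rate mu).
    """
    rng = random.Random(seed)
    t, in_cache, cached_time = 0.0, False, 0.0
    while t < horizon:
        rate = mu if in_cache else lam
        dwell = min(rng.expovariate(rate), horizon - t)
        if in_cache:
            cached_time += dwell
        t += dwell
        in_cache = not in_cache
    return cached_time / horizon


closed_form = cache_probability(3.0, 1.0)
estimate = simulate_cache_fraction(3.0, 1.0, horizon=20000.0)
```

With a request rate three times the drop rate, the closed form gives 0.75, and the simulated fraction should settle near that value for a long enough horizon.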



III. Proof of Theorem 1

In this section, we prove Theorem 1 by utilizing some
lemmas.

Lemma 1. Consider the wireless networks described in Theorem
1. For sufficiently large networks and when p is large enough
($p = \Omega(n^{-1/2})$ for cases i and ii, and $p = \Omega(\log^{-1} n)$ for case
iii), the average number of hops between the customer and
the nearest cached content location is



Fig. 2. Data access process time diagram in a cache network
(timeline labels: $T_3 \sim f_{H,B}(h, b)$, $T_1 \sim \mathrm{Exp}(\mu)$, $T_2 \sim \mathrm{Exp}(\lambda)$).



Proof: Scenario i: The probability that the first node on
the path from the requester (v) to the server contains the
content in its local cache is p, and the probability that the
closest node to the customer on the query path caching the
requested information is the h-th node is $(1-p)^{h-1}p$. Thus
the average number of hops between the customer and the
nearest cached content location is

$$\bar{h} = \sum_{h=1} h\,P(H = h) = \sum_{h=1} h\,(1-p)^{h-1}p \;\overset{\text{large } n}{=}\; \Theta\!\left(\frac{1}{p}\right). \qquad (3)$$

The result is valid for $p = \Omega(n^{-1/2})$, and for lesser values
of p we will have $\bar{h} = \Theta(\sqrt{n})$.
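
The two regimes of the path-wise search can be seen by evaluating the sum directly. The following sketch is only an illustration: it assumes that a request which finds no cache within `h_max` hops is served by the server at distance `h_max`, and the function name and parameter values are hypothetical.

```python
def mean_hops_pathwise(p, h_max):
    """Average hop count when each node on the path caches with probability p.

    cache_part: the first hit is the h-th node on the path.
    server_part: no hit on the whole path, so the server at h_max serves
    the request (an assumption of this sketch).
    """
    cache_part = sum(h * (1 - p) ** (h - 1) * p for h in range(1, h_max + 1))
    server_part = h_max * (1 - p) ** h_max
    return cache_part + server_part


# p well above 1/h_max: the average is close to 1/p.
large_p = mean_hops_pathwise(0.1, 1000)
# p well below 1/h_max: the average is close to h_max itself.
small_p = mean_hops_pathwise(1e-4, 100)
```

Taking `h_max` proportional to $\sqrt{n}$ on the grid, the first call corresponds to $p = \Omega(n^{-1/2})$ and returns a value near $1/p = 10$, while the second corresponds to a smaller p and returns a value near `h_max`.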

Scenario ii: The probability that the discovered cache is
located at a distance of one hop from the requester is the
probability that one of the nodes on the ring at one hop
distance contains the data (it consists of 4 nodes), which equals
$1 - (1-p)^4$, and the probability that the data needs to
travel through h hops from the discovered cache to where it
is required is $(1 - (1-p)^{4h}) \prod_{k=1}^{h-1} (1-p)^{4k}$, as there are 4h
nodes at a distance of h hops. Therefore,

$$\bar{h} = \sum_{h=1} h\,\bigl(1 - (1-p)^{4h}\bigr) \prod_{k=1}^{h-1} (1-p)^{4k} \;\overset{\text{large } n}{=}\; \Theta\!\left(\frac{1}{\sqrt{p}}\right), \qquad (4)$$

where the last equality is correct when $p = \Omega(n^{-1/2})$. For
smaller p's, $\bar{h}$ will increase to $\Theta(\sqrt{n})$.
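
The ring-search average can also be evaluated numerically from the probabilities described above (4h nodes on the ring at distance h, each caching independently with probability p). This is a sketch under those assumptions only; the function name and the truncation point `h_max` are choices made for the example, and boundary effects of a finite grid are ignored.

```python
def mean_hops_flooding(p, h_max):
    """Average distance to the nearest cache under ring search.

    miss_so_far is the probability that all rings closer than h are empty;
    hit_here is the probability that the ring at distance h holds a copy.
    """
    total, miss_so_far = 0.0, 1.0
    for h in range(1, h_max + 1):
        ring_miss = (1 - p) ** (4 * h)
        hit_here = 1 - ring_miss
        total += h * hit_here * miss_so_far
        miss_so_far *= ring_miss
    return total


# Reducing p by a factor of 100 increases the average by about a factor
# of 10 in this computation, i.e. it grows roughly like 1/sqrt(p).
ratio = mean_hops_flooding(1e-4, 2000) / mean_hops_flooding(1e-2, 2000)
```

The slower growth compared with the path-wise search reflects that the number of caches inspected within h hops grows quadratically in h under flooding and only linearly along a single path.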

Scenario iii: The discovered cache is one hop away from
the requester if there is a replica of the data in a cache in
the same cell or in the adjacent cell toward the server. So
since there are $\log n$ nodes in each cell, the probability of the
discovered cache being at one hop distance is $1 - (1-p)^{2\log n}$,
and the probability of the discovered cache being at a distance of
h hops away from the requester is $(1-p)^{h \log n}(1 - (1-p)^{\log n})$.
The maximum number of hops that may be traveled this way
is $\frac{1}{r(n)}$. Thus

$$\bar{h} = 1 - (1-p)^{2\log n} + \sum_{h=2}^{1/r(n)} h\,(1-p)^{h\log n}\bigl(1 - (1-p)^{\log n}\bigr) \;\overset{\text{large } n}{=}\; \Theta(1), \qquad (5)$$

where the last equality is correct when $p = \Omega(\log^{-1} n)$. ■

In scenario iii the average number of hops between the
nearest content location and the customer is just $\Theta(1)$.
This is the result of having $\log n$ caches within one hop
of every requester. Each one of these caches can be a potential
source for the content. When the network grows, this number
will increase, and if p is large enough ($p = \Omega(\log^{-1} n)$) the
probability that at least one of these nodes contains the required
data will approach 1, i.e., $\lim_{n \to \infty} (1 - (1-p)^{\log n}) = 1$.
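
A short numerical evaluation shows how quickly the random-network hop count approaches one. The sketch below evaluates the hop-count expression of scenario iii as described in the proof; it assumes natural logarithms, $\log n$ nodes per cell, and $r(n) = \sqrt{\log n / n}$ with all constants set to 1, none of which is fixed by the text. The function name is hypothetical.

```python
import math


def mean_hops_random(p, n):
    """Hop-count expression for the random network (scenario iii)."""
    nodes_per_cell = math.log(n)              # assumed: natural log, constant 1
    h_max = int(math.sqrt(n / math.log(n)))   # 1 / r(n) with r(n) = sqrt(log n / n)
    miss_cell = (1 - p) ** nodes_per_cell     # no copy in one cell
    one_hop = 1 - miss_cell ** 2              # copy in own or adjacent cell
    tail = sum(h * miss_cell ** h * (1 - miss_cell)
               for h in range(2, h_max + 1))
    return one_hop + tail


small_network = mean_hops_random(0.3, 100)
large_network = mean_hops_random(0.3, 10 ** 6)
```

For a fixed p, the value for the larger network is already within a fraction of a percent of one hop, which is the behavior the $\Theta(1)$ result describes.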

Lemma 2. The average probability that the server needs to
serve a request is




(i) 
(ii) 

(iii)



(6) 



Proof: Scenario i: The data will need to be downloaded
from the server (at average distance $h_s$) if no copy of the
data is available on the path between a requester node and the
server. As the network area is assumed to be a square and the
server is in the middle of it, this probability is bounded by



Thus for large n, p s 



< Ps 



Note that both $h_s$ and $h_{\max}$, the maximum number of hops
which may be traveled between the requester and the node that
possesses a valid copy of data, in this scenario are $\Theta(\sqrt{n})$.

Scenario ii: The data will need to be downloaded from the
server (at average distance $h_s$) if no copy of the content is
available in the network caches. Since, compared to scenario
i, more nodes are involved in the process of content discovery,
it is obvious that in this case the request will be forwarded


rip 2 



to the server with less probability. Thus p s = 0( 

Scenario iii: The data is downloaded from the server if no
node in the cells on the path toward the server cell contains a
copy of the content.

It can be seen that in all cases the average number of hops
between the server and the node requesting the content is a
function of the total number of nodes in the network and p.

Now we can prove Theorem 1 using the above lemmas.
Proof: Assume that each content is retrieved with rate $\gamma$
bits/sec. The traffic generated because of one download from
a cache at an average distance of $\bar{h}$ hops from the requester node
is $\gamma\bar{h}$, and the traffic generated due to the downloads from
the server at an average distance of $h_s$ hops from the requester is
$\gamma h_s$. The probability that the server is uploading the data is $p_s$
and the probability that a cache node is serving the customer
is $\bar{p} = 1 - p_s$. The total number of requests for a content in the
network at any given time is limited by the number of nodes
not having the content in their own cache ($(1-p)n$). Thus
the maximum total bandwidth needed to accomplish these
downloads will be $(1-p)n(\bar{p}\bar{h} + p_s h_s)\gamma$, which is upper
limited by $\Theta(W\sqrt{n})$ in scenarios i, ii, and $\Theta(W/r^2(n))$ in
scenario iii.

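Therefore the maximum download rate is

$$\gamma_{\max} = \Theta\!\left(\frac{W\sqrt{n}}{n(1-p)\bigl((1-p_s)\bar{h} + p_s h_s\bigr)}\right)$$

in scenarios i and ii, and

$$\gamma_{\max} = \Theta\!\left(\frac{W}{r^2(n)\,n(1-p)\bigl((1-p_s)\bar{h} + p_s h_s\bigr)}\right)$$

in scenario iii, where $\bar{h}$ is the average number of hops to the serving cache.
The results of Theorem 1 can be derived by approximating
these equations for sufficiently large n. Note that if there were
no cache in the system, or p is less than the stated threshold
values, all the requests would be served by the server, and the
maximum download rate would be $\Theta(W/n)$ in cases i and
ii, and $\Theta(W/\sqrt{n \log n})$ in case iii. ■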
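
The bandwidth argument of the proof can be turned into a small calculator for the grid scenarios. This is a sketch with all order-notation constants set to 1, which is an assumption made purely for illustration; the function name and the numeric inputs are hypothetical.

```python
import math


def grid_max_rate(W, n, p, p_s, h_cache, h_server):
    """Largest per-download rate the grid can carry, constants set to 1.

    Demand per unit rate: n(1 - p) concurrent downloads, each crossing
    h_cache hops with probability 1 - p_s or h_server hops with
    probability p_s. Supply: transport capacity W * sqrt(n).
    """
    demand_per_unit_rate = n * (1 - p) * ((1 - p_s) * h_cache + p_s * h_server)
    return W * math.sqrt(n) / demand_per_unit_rate


# Half of the nodes cache the content and downloads travel two hops.
with_cache = grid_max_rate(1.0, 10000, 0.5, 0.0, 2.0, 50.0)
# No caching at all: every download crosses about sqrt(n) hops to the server.
no_cache = grid_max_rate(1.0, 10000, 0.0, 1.0, 1.0, 100.0)
```

With these inputs the no-cache rate equals $W/n$, while the cached case is larger by a factor on the order of $\sqrt{n}$, in line with the comparison made in the abstract.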

IV. Proof of Theorem 2

In the previous section the maximum throughput capacity
in a wireless cache network has been calculated. Now it is
important to verify whether this throughput can be supported by
each cell (node), i.e., that the traffic carried by each cell (node) is
not more than what it can support ($\Theta(1)$). Here we start with
scenario iii and a complete proof. Scenario i will then be
briefly studied. Similar reasoning can be used for scenario ii.

Proof: Scenario iii: The traffic load at the server is
$\gamma_{\max} p_s n(1-p) = O(1)$. So the flow at the server will not be
a bottleneck.

The traffic load at a node as a customer will not be a
bottleneck either, as it does not exceed the maximum data rate,
which is $\gamma_{\max} = \Theta\!\left(\frac{W}{\log n\,(1-p)}\right) < \Theta(1)$.

To compute the traffic load at a node which is serving a
request, we need to know how many requests that node may
serve at a time. A node $v_i \in V$ is the download source if it has
the information (with probability p), it is in the same cell as the requester or in
a cell on the path from the requester to the server and no node
in the previous cells on this path contains the content ($(1-p)^{x \log n}$,
where x is the number of hops between $v_j$ and $v_i$),
and among those nodes in the same cell which have the data
$v_i$ is selected to serve the query ($\sum_{k=1}^{\log n} \frac{1}{k}\binom{\log n - 1}{k-1} p^{k-1}(1-p)^{\log n - k}$).
For not too small p and large n, we have

P($v_i$ is serving $v_j$'s request) =



1 

log n ' 

(l-p) logr 
log n 

(1-P) h+1< 
log n 

0, 



Vj and Vi are in the same cell 



1 & Vi G P h 



1 < hji = h < 
otherwise 



Vi G Pj. 



(9) 



Therefore, each node containing the content will serve
only the nodes in the same cell with high probability, and
the probability of being selected to serve the query initiated
in the same cell is $\frac{1}{p \log n}$. Based on the bin-balls Theorem
[17], the maximum number of queries served by a node will
be $\Theta\!\left(\frac{\log\log n}{\log\log\log n}\right)$. Consequently, the maximum traffic load per
source is

$$\gamma_{\max} \frac{\log\log n}{\log\log\log n} = \Theta\!\left(\frac{W \log\log n}{(1-p)\log n\,\log\log\log n}\right).$$

Therefore, to make sure that all the cells can support the stated throughput,
$\frac{1}{1-p} = 1 + \lambda/\mu$ is not allowed to exceed $\Theta\!\left(\frac{\log n\,\log\log\log n}{\log\log n}\right)$.
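
The balls-into-bins behavior used in this step can be observed empirically. The sketch below throws m requests uniformly at random at m candidate sources and reports the busiest one; the uniform selection, the function name, and the sizes used are assumptions of the example rather than details taken from the proof.

```python
import math
import random
from collections import Counter


def max_load(num_balls, num_bins, seed=0):
    """Largest number of balls (requests) landing in a single bin (source)."""
    rng = random.Random(seed)
    loads = Counter(rng.randrange(num_bins) for _ in range(num_balls))
    return max(loads.values())


m = 10000
busiest = max_load(m, m)
# Reference growth rate for comparison, constants ignored.
reference = math.log(m) / math.log(math.log(m))
```

The busiest source receives only a handful of requests even though m requests are in flight, and the count grows like $\log m / \log\log m$ up to a constant factor. In scenario iii the role of m is played by the roughly $\log n$ nodes of a cell, which gives the $\log\log n / \log\log\log n$ order quoted above.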

Finally, each download of information will generate a traffic
load on all the intermediate cells on the path from the source to
the customer. However, as stated in Section III, the probability
that the required content is discovered at a distance of one hop
is $1 - (1-p)^{2\log n}$, which is almost one for large n. So
we may conclude that with high probability in sufficiently
large networks no cell is working as a relay, or the number of
transmissions passing through a cell as a relay is close to zero.



Scenario i: Similar to scenario iii, the maximum traffic load
is the load generated in a node when serving the requests. Here
there are $n(1-p)$ requests which will be served by $np$ other
nodes, so according to the bin-balls Theorem the maximum
number of requests for a node will be in the order of $\frac{\log n}{\log\log n}$, which



generates f1 ^ logn 

b n(l — p 



WX log n 



traffic at the busiest 



log n np log log n 

node. This traffic does not exceed 6(1) since the maximum 
requests to drop rate in this case is n n . ■ 

V. Discussion and Future Work 

We studied the impact of caching with limited lifetime on
the maximum capacity order in the grid and random networks
where the received data is stored at the receivers and is shared
with the other nodes as long as the node keeps the content. Figure
3(a) shows the maximum throughput order for $\lambda/\mu = 7$
as a function of the network size. According to Theorem 1 and
as can be observed from this figure, the maximum throughput
capacity of the network in a grid network with the described
characteristics is inversely proportional to the square root of
the network size if the request rate and the cache time-out
times are fixed. Similarly, in the random network the maximum
throughput is inversely proportional to the logarithm of the
network size.

On the other hand, with a fixed network size, if the
order of the ratio of the request rate to the content time-out rate is greater
than a threshold ($\Theta(n^{-1/2})$ in cases i, ii and $\Theta(\log^{-1} n)$ in
case iii), most of the requests will be served by the caches and
not the server, so increasing the request rate or decreasing the
time-out rate will increase the probability of an intermediate
cache having the content and reduce the number of hops
needed to forward the content to the customer, and consequently
increase the throughput (Figure 3(b)). For request
to time-out ratio orders less than these thresholds, most of the
requests are served by the main server ($p_s$ approaches 1), so
the maximum possible number of hops will be traveled by each
content to reach the requester and the minimum throughput
capacity ($\Theta(W/n)$ in cases i, ii and $\Theta\!\left(\frac{W}{\sqrt{n \log n}\,(1-p)}\right)$ in
case iii) will be achieved.

Figures]?] (a), (b) respectively illustrate the total request rate 
and the total traffic generated in a fixed size network in 
scenario i for different request to time-out rate ratios. The total 
request rate in the network is the product of the number of 
requesting nodes and the rate at which each node is sending the 
request. The total traffic is the product of the total request rate 
and the number of hops between source and destination and 
the content size. Small X/p means that each node is sending 
requests with low rate, so fewer nodes have the content and 
more nodes are sending requests. In this case most of the 
requests are served by the server. The total request rate will 
increase by increasing the per node request rate. High X/p 
shows that each node is requesting the content with higher 
rate, so the number of cached content in the network is high 
and fewer nodes are requesting the content. Here most of the 
requests are served by the caches. The total request rate then is 
determined by the content drop rate. So for very large X/p, the 



total request rate is the total number of nodes in the network 
times the drop rate (n/i) and the total traffic is nfiB. 
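
The limiting behavior for large $\lambda/\mu$ follows from writing the number of requesting nodes as $n(1-p)$ with $p = \lambda/(\lambda+\mu)$. The sketch below makes that arithmetic explicit; the function names and numbers are hypothetical, and the hop count is passed in as a parameter rather than derived.

```python
def total_request_rate(n, lam, mu):
    """n(1 - p) requesting nodes, each issuing requests at rate lam.

    With p = lam / (lam + mu) this equals n * lam * mu / (lam + mu).
    """
    return n * lam * mu / (lam + mu)


def total_traffic(n, lam, mu, hops, content_size):
    """Total request rate times hop count times content size."""
    return total_request_rate(n, lam, mu) * hops * content_size


# Very large request rate relative to the drop rate: the total request
# rate approaches n * mu, and one-hop downloads give traffic n * mu * B.
rate_limit = total_request_rate(100, 1e9, 2.0)
traffic_limit = total_traffic(100, 1e9, 2.0, hops=1, content_size=5.0)
# Very small request rate: almost every node requests, total is about n * lam.
rate_small = total_request_rate(100, 1e-6, 2.0)
```

The expression is symmetric in $\lambda$ and $\mu$, so the total request rate is limited by whichever of the two rates is smaller.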

However, when the network grows the traffic in the network 
will increase and the download rate will decrease. If we 
assume that the new requests are not issued in the middle 
of the previous download, the request rate will decrease with 
network growth. If the holding time of the contents in a cache 
increases accordingly the total traffic will not change, i.e. if by 
increasing the network size the requests are issued not as fast 
as before, and the contents are kept in the caches for longer 
times, the network will perform similarly. 

Furthermore, if the ratio of request to content drop rates 
increases with network growth, higher throughput capacities 
may be achievable. For example, in scenario iii if $\lambda/\mu$ =

e ( iogi S ogn )' then the resultin g throughput will be j max = 
©(l^ifg^) » °(k^)- Note that according to Theorem 
j5J ^ is upper bounded by some values, so the achievable 

capacity will be upper bounded by 9( ^ V f Q 1 ^ logn ) (i) and 
Fig. 3. Maximum download rate ($\gamma_{\max}$) vs. (a) the number of nodes (n),
(b) the request to time-out rate ratio ($\lambda/\mu$).

In this work we have made several assumptions to simplify 
the analysis. For example, we assumed all the contents have 
the same characteristics (size, popularity). These assumptions 
will be relaxed in future work. We also assumed that the data 
is cached just in the receiver, and the requester downloads the 
data completely from one nearest content location. However, 
if all the intermediate nodes are allowed to cache the data 
and share it (as in the model of [18] for instance), or the node
that needs the data can download each part of it from different 
nodes and makes a complete content out of the collected parts, 
higher capacities may be achievable. Proposing a caching and 
downloading scheme that can improve the capacity order is 
part of our future work. 